Abstract: In optical character recognition (OCR) and
document image analysis, skew is often introduced into
the incoming document image. Skew degrades the
performance of OCR and image analysis systems, so
detecting and correcting the skew angle is an important
preprocessing step in document analysis. Many methods
have been proposed by researchers for the detection of
skew in binary document images. The majority of them
are based on the projection profile, Fourier transform,
cross-correlation, Hough transform, nearest neighbour
connectivity, linear regression analysis and
mathematical morphology. The main advantages of the
Hough transform are its accuracy and simplicity.
Because it is slow, however, many researchers have
worked on improving its speed without compromising
accuracy, and several variations have been proposed to
reduce the computation time of skew angle detection.
In this paper we introduce a new method that reduces
the time complexity without compromising the accuracy
of the Hough transform.

I. INTRODUCTION

Document image processing has become an
increasingly important technology in the automation of
office documentation tasks. Automatic document
scanners such as text readers and OCR (Optical
Character Recognition) systems are an essential
component of systems capable of those tasks. One of
the problems in this field is that the document to be
read is not always placed correctly on a flatbed
scanner. The document may be skewed on the scanner
bed, resulting in a skewed image. Skew is any deviation
of the image from the original document such that its
text lines are not parallel to the horizontal or
vertical axis. Skew correction remains one of the vital
parts of document processing, and many methods have
been proposed by researchers for the detection of skew
in binary document images [1]. Skew has a detrimental
effect on document analysis, document understanding,
and character segmentation and recognition.
Consequently, detecting the skew of a document image
and correcting it are important issues in realizing a
practical document reader. To increase the performance
of an OCR system, we must both detect and correct the
skew. Normally, once the skew is detected, the image is
rotated in the opposite direction. Methods for
detecting skew include the projection profile, Fourier
transform, Hough transform, nearest neighbour
connectivity, linear regression analysis and
mathematical morphology, and different researchers
have used different methods to solve this problem. The
main advantages of the Hough transform are its accuracy
and simplicity. Because it is slow, however, many
researchers have worked on its speed without
compromising accuracy, and several variations have been
proposed to reduce the computation time of skew angle
detection. On the basis of the number of skew angles
and their orientation, three types of skew can arise
when a document is scanned:

1. Global skew: the whole document has a common
angle of orientation.

2. Multiple skew: different parts of the document have
different angles of orientation.

3. Non-uniform text line skew: a single text line
contains several orientations [11].

Figure 1. Skewed image with a skew angle of 2 degrees

Figure 1 shows a skewed image, deflected from its
normal orientation by 2 degrees. In Figure 2 the skew
has been removed, so the image appears in its correct
form. Various methods are available for the detection
and correction of the skew angle. Each method has its
own advantages and disadvantages, on the basis of which
the efficiency of a particular algorithm can be
assessed.

Figure 2. Document image after rotation by 2 degrees

II. RELATED WORK 

Generally, a variety of global skew detection and
correction techniques are available. Most of these
techniques are reviewed by Hull [1]. Broadly, skew
estimation approaches are classified into basic
categories: projection profile, Hough transform,
nearest neighbour clustering, and cross-correlation.
Historically, Hough transform based document skew
detection and correction was proposed by Srihari and
Govindaraju (1989) [2]. They calculate the Hough
transform at all angles θ between 0° and 180°. A
heuristic measures the rate of change in accumulator
values at each value of θ. The skew angle is set to the
value of θ that maximizes the heuristic [3]. Hinds
et al. (1990) use the Hough transform and run-length
encoding to estimate the document skew. They reduce
the data by using horizontal and vertical run-length
computations. The document image, acquired at 300 dpi,
is undersampled by a factor of 4 and transformed into a
burst image. This image is built by replacing each
vertical black run with its length, placed in the
bottom-most pixel of the run. The Hough transform is
then applied to all the pixels in the burst image that
have a value less than 25, with the aim of discarding
the contributions of non-textual components [4]. The
bin with the maximum value in the Hough space
determines the skew angle. Jiang et al. used the Hough
transform on points detected at a coarse level, and the
accurate skew is obtained by choosing the peak value
for the skew angle [5]. Yu and Jain used a fast and accurate
approach on a set of low-resolution images. They use a
hierarchical Hough transform and the centroids of
connected components. First, the algorithm efficiently
computes the connected components and their centroids
using a block adjacency graph; then the Hough
transform is applied to the centroids using two angular
resolutions [6]. Spitz et al. used data reduction
techniques designed for compressed images, in which
the data points are obtained in a single pass and mapped
into Hough space [7]. Chaudhary and Pal have
proposed a technique for Indian language scripts that
exploits the inherent properties of the script to
determine the skew angle. The idea is to detect the
skew angles of the head lines of these scripts. The
method detects skew angles in the range -45° to 45°
[8]. Amin and Fischer (2000) apply the Hough transform
to de-skew the document image in two stages. First,
blocks of text, such as paragraphs and captions of
pictures, are identified. Next, they calculate the skew
angle of each block by fitting straight lines using the
least squares method; only the bottom line of a block
is considered for skew detection, in order to increase
speed [9].
Singh et al. have proposed a new algorithm that
speeds up the classic Hough transform. The algorithm
converts the voting procedure into a hierarchy-based
voting method, which improves speed and reduces the
space requirements. Their fast Hough transform consists
of three sub-processes. First, in the pre-processing
stage, a block adjacency graph (BAG) is used. Then
voting is done using the Hough transform. Finally, the
skew is corrected by rotation. However, the BAG-based
algorithm is effective for Roman script documents but
is not satisfactory for Indian scripts, where the
headline is part of the script. This approach is
therefore script dependent [10]. Manjunath et al.
[11] also used the Hough transform to detect the skew
angle, in two steps. Initially, they identify character
blocks in the document image and perform thinning over
all regions. The thinned regions are then fed to the
Hough transform. The primary disadvantage of this
technique is that the reported time complexity does not
include the time taken by the thinning process. Ruilin
Zhang et al. use the Hough transform with
multi-threshold analysis for skew detection in fabric
images [12]. Their paper analyzes the principle of the
Hough transform for skew detection and describes how to
apply it in combination with the Sobel operator.
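
The burst-image reduction attributed to Hinds et al.
above can be sketched as follows. This is only an
illustration in Python, not their implementation: the
function names `burst_image` and `hough_input_points`
are hypothetical, the image is assumed to be a list of
rows with 1 for black and 0 for white, and points are
returned as (x, y) = (column, row). The threshold of 25
is the value quoted in the text.

```python
def burst_image(binary):
    """Replace each vertical black run by its length, stored in the
    bottom-most pixel of the run; every other pixel becomes 0."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    out = [[0] * cols for _ in range(rows)]
    for c in range(cols):
        run = 0
        for r in range(rows):
            if binary[r][c]:
                run += 1
                # The run ends at the last row or when the pixel below is white.
                if r == rows - 1 or not binary[r + 1][c]:
                    out[r][c] = run
                    run = 0
    return out


def hough_input_points(burst, max_run=25):
    """Keep burst pixels whose run length is positive and below max_run."""
    return [(c, r)
            for r, row in enumerate(burst)
            for c, v in enumerate(row)
            if 0 < v < max_run]
```

Long vertical runs, which are more likely to come from
pictures or rules than from text, are dropped by the
threshold, so fewer points reach the Hough transform.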

III. HOUGH TRANSFORM 

The simplest form of the Hough transform is the linear
transform for detecting straight lines. In image space,
a straight line can be represented by the equation
y = mx + b and plotted for each pair of image points
(x, y). The main idea of the Hough transform is to
consider the characteristics of the straight line not
in terms of its image points (x, y) but in terms of its
parameters, here the slope m and the intercept b. The
straight line y = mx + b can therefore be represented
as a point (b, m) in the parameter space. However,
vertical lines give rise to unbounded values of m and
b. For computational reasons it is therefore better to
parameterize the lines with two other parameters,
commonly referred to as ρ (rho) and θ (theta), so that
a line is represented by the equation
x cos θi + y sin θi = ρi. Here ρ is the distance
between the line and the origin, and θ is the angle of
the vector from the origin to the closest point on the
line. Figure 3 shows the parameter plane of ρ and θ, in
which X and Y are the axes, ρ is the distance and θ is
the angle. However, accumulation with this equation is
slower than with the slope-intercept equation.
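
As a small numerical illustration of the (ρ, θ) form,
the Python sketch below evaluates ρ = x cos θ + y sin θ
for three collinear points. The helper name `rho_for`
and the sample line y = x + 2 are chosen here for
illustration only. At the angle of the line's normal
(135 degrees for this line) all three points give the
same ρ, which is why collinear points vote for the same
cell; at another angle the values differ.

```python
import math


def rho_for(x, y, theta_deg):
    """Return rho = x*cos(theta) + y*sin(theta), theta given in degrees."""
    t = math.radians(theta_deg)
    return x * math.cos(t) + y * math.sin(t)


# Three points on the line y = x + 2 (illustrative data).
points = [(0, 2), (1, 3), (2, 4)]

# At the normal angle of this line every point yields the same rho.
rhos_on_line = [rho_for(x, y, 135.0) for x, y in points]

# At a different angle (90 degrees) rho is just y, so the values differ.
rhos_off_line = [rho_for(x, y, 90.0) for x, y in points]
```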




Figure 3. Parameter plane of ρ and θ

The Hough transform takes a binary edge map as input
and finds edge points that are positioned like
straight lines. The idea is that every edge point in
the edge map is transformed to all possible lines that
could pass through that point. Line detection in a
binary image using the Hough transform proceeds as
follows:

1. Select the Hough transform parameters ρmin,
ρmax, θmin and θmax.

2. Quantize the (ρ, θ) plane into cells by forming an
accumulator cell array A(ρ, θ), where ρ is between
ρmin and ρmax, and θ is between θmin and θmax.

3. Initialize every element of the accumulator cell
array A to zero.

4. For each black pixel (x, y) in the binary image,
perform the following:

For each value of θi from θmin to θmax, calculate the
corresponding ρi using the equation
x cos θi + y sin θi = ρi. Round ρi to the nearest
allowed ρ value. Update the accumulator array element
A(ρi, θi) by adding one vote.

5. Finally, local maxima in the accumulator cell array
correspond to sets of points lying on a common line in
the binary image.
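
The steps above can be sketched in Python as follows.
This is a minimal illustration under stated
assumptions, not an optimized implementation: the
function names `hough_accumulate` and `strongest_line`
are hypothetical, the black pixels are passed as a list
of (x, y) pairs, the accumulator is a sparse dictionary
rather than a preallocated array, and only the single
global maximum is returned instead of all local maxima.

```python
import math


def hough_accumulate(points, theta_min, theta_max, theta_step, rho_step=1.0):
    """Vote in a (rho, theta) accumulator for a list of (x, y) black pixels.

    Returns (acc, thetas): acc maps (rho_index, theta_index) to a vote
    count, and thetas lists the quantized angles in degrees.
    """
    n_theta = int(round((theta_max - theta_min) / theta_step)) + 1
    thetas = [theta_min + i * theta_step for i in range(n_theta)]
    acc = {}  # sparse accumulator: a missing cell counts as zero (step 3)
    for x, y in points:                      # step 4: every black pixel
        for ti, theta in enumerate(thetas):  # every allowed theta
            t = math.radians(theta)
            rho = x * math.cos(t) + y * math.sin(t)
            ri = int(round(rho / rho_step))  # nearest allowed rho
            acc[(ri, ti)] = acc.get((ri, ti), 0) + 1
    return acc, thetas


def strongest_line(points, theta_min=0.0, theta_max=179.0, theta_step=1.0):
    """Return (rho_index, theta_degrees, votes) of the fullest cell (step 5)."""
    acc, thetas = hough_accumulate(points, theta_min, theta_max, theta_step)
    (ri, ti), votes = max(acc.items(), key=lambda kv: kv[1])
    return ri, thetas[ti], votes
```

For example, 100 pixels on the row y = 5
(x = 0, ..., 99) give the strongest cell at ρ = 5 and
θ = 90 degrees with 100 votes, because θ is the angle
of the line's normal rather than of the line itself.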

The running cost is O(n·A), where n is the number of
points and A is the number of distinct angle values.
The more accuracy we need, the finer the angle
intervals we must use, hence the more angle values and
the longer the running time.
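
To make the cost concrete, the short Python sketch
below counts the votes cast for a given number of
points and angular step. The function name
`hough_vote_count` and the figures used (10,000 points,
a 180 degree range) are illustrative assumptions, not
measurements from this paper.

```python
def hough_vote_count(n_points, theta_range_deg, theta_step_deg):
    """Number of accumulator updates: one per point per quantized angle."""
    n_angles = int(round(theta_range_deg / theta_step_deg)) + 1
    return n_points * n_angles


# Refining the step from 1 degree to 0.1 degree raises the number of
# angles from 181 to 1801, so the work grows roughly tenfold.
coarse = hough_vote_count(10_000, 180, 1.0)
fine = hough_vote_count(10_000, 180, 0.1)
```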

IV. METHODS FOR INCREASING THE SPEED OF THE
HOUGH TRANSFORM

1. Converting floating-point operations to integer
operations: In this method the floating-point
operations are converted into integer operations,
which increases the speed of the Hough transform.
Accuracy is affected, however, so to maintain it we
must use the nearest integer to the result of each
floating-point operation.

2. Pre-computation: Many operations in skew angle
detection are repetitive. Their results can be
precomputed and stored in an array, which reduces
the number of calculations.

3. Using a hierarchical approach: The main idea of
this method is to reduce the amount of input data.
Researchers first use a coarser Hough space, from
which only a rough estimate is obtained. This
approach is equally suitable for handwritten
documents [6].

4. Using the BAG algorithm: In this method the input
data is reduced by taking the centroids of connected
components rather than all image pixels [11].

5. Rotation: Singh et al. (2008) distinguish two
types of rotation, forward rotation and inverse
rotation. One would expect both to give the same
result, but they observed that the results differ:
forward rotation takes less time than inverse
rotation, whereas under certain conditions inverse
rotation yields a higher-quality rotated image.
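
The first two methods in the list above (integer
operations and pre-computation) can be combined, as the
following Python sketch suggests. It is a sketch under
assumptions rather than the implementation of any cited
work: the names `SCALE`, `build_tables` and `rho_fixed`
are hypothetical, and the fixed-point scale of 1024 is
an arbitrary choice. The sine and cosine of each angle
are computed once and stored as scaled integers, and ρ
is then obtained with integer arithmetic only, rounded
to the nearest integer as method 1 requires.

```python
import math

SCALE = 1 << 10  # assumed fixed-point scale: 10 fractional bits


def build_tables(thetas_deg):
    """Precompute scaled integer cos/sin once per angle, not once per pixel."""
    cos_t = [int(round(math.cos(math.radians(t)) * SCALE)) for t in thetas_deg]
    sin_t = [int(round(math.sin(math.radians(t)) * SCALE)) for t in thetas_deg]
    return cos_t, sin_t


def rho_fixed(x, y, cos_i, sin_i):
    """Integer rho for integer pixel coordinates, rounded to nearest.

    Adding half the scale before the floor division rounds instead of
    truncating; Python's // floors, so this also holds for negative rho.
    """
    v = x * cos_i + y * sin_i
    return (v + SCALE // 2) // SCALE
```

In the voting loop, `build_tables` would be called once
before iterating over the pixels, and `rho_fixed` would
replace the floating-point evaluation of
x cos θ + y sin θ for each pixel and angle.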

V. PROPOSED SOLUTION 

Our skew detection approach is based on a modified
Hough transform (HT). We apply the HT to the set of
pixels with a modified technique, so that the total
time taken by the algorithm is reduced while the
accuracy of the process is kept intact. We divide the
range of the HT space, i.e., the skew angle, which can
be from 0 degrees to 45 degrees, into tenths, and thus
find the portion in which the resulting skew lies. Only
that portion is then investigated further by dividing
it into tenths again, and so on. In this way the
algorithm reaches the solution quickly compared with
the classical HT.

Figure 4: Representation of the proposed technique.

Figure 4 depicts the proposed process. First the HT is
applied for angles from 0 degrees to 45 degrees with a
step of 4.5 degrees. Assume that the portion that
attracts the maximum votes is the range from a degrees
to b degrees. Then only the portion from a to b degrees
is explored further, using the HT at a higher
resolution.
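
One possible reading of this coarse-to-fine search is
sketched below in Python. Several details are
assumptions made for the sketch and are not specified
in the text: the names `peak_votes` and
`coarse_to_fine_skew` are hypothetical, each candidate
angle is scored by its fullest ρ bin, the refined
portion is taken as a window one step wide centred on
the best candidate, and the first maximum wins when
candidates tie. Here α is the inclination of the text
line to the x-axis, which corresponds to θ = α + 90
degrees in the (ρ, θ) form.

```python
import math


def peak_votes(points, alpha_deg, rho_step=1.0):
    """Fullest accumulator bin for lines inclined at alpha_deg to the x-axis."""
    a = math.radians(alpha_deg)
    s, c = math.sin(a), math.cos(a)
    bins = {}
    for x, y in points:
        # Normal form with theta = alpha + 90 degrees:
        # rho = -x*sin(alpha) + y*cos(alpha)
        ri = int(round((-x * s + y * c) / rho_step))
        bins[ri] = bins.get(ri, 0) + 1
    return max(bins.values(), default=0)


def coarse_to_fine_skew(points, lo=0.0, hi=45.0, parts=10, levels=2):
    """Search [lo, hi] degrees in tenths, then refine around the best angle."""
    best = lo
    for _ in range(levels):
        step = (hi - lo) / parts
        candidates = [lo + i * step for i in range(parts + 1)]
        best = max(candidates, key=lambda a: peak_votes(points, a))
        # Next level: a window one step wide around the best candidate,
        # clipped to the current range.
        lo, hi = max(lo, best - step / 2), min(hi, best + step / 2)
    return best
```

With two levels the sketch evaluates 22 angles to reach
a step of 0.45 degrees, whereas an exhaustive scan of
0 to 45 degrees at that step evaluates 101 angles. Once
the step becomes small relative to the ρ quantization,
neighbouring candidates can tie, so adding levels does
not by itself guarantee a finer estimate.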

VI. CONCLUSION 

There are different methods for document image
skew detection. These include methods that compute
projection profiles at different angles directly from
the image data, methods that compute projection
profiles from image features, and a second class of
algorithms that use the Hough transform, which detects
straight lines and other parametric curves. Another
class of techniques extracts features with local,
directionally sensitive masks. The Hough transform is
slow but robust to interference, so it is the most
widely used. In this paper we reviewed several
variations of the Hough transform; each has its own
speed for different scripts. Only preliminary efforts
have been made in comparative performance evaluation.
Further work in this area could help show the
performance of the proposed solution.